Abstract — A harvest-use-store power splitting (PS) relaying
strategy with distributed beamforming is proposed for
wireless-powered multi-relay cooperative networks in this paper.
Different from the conventional battery-free PS relaying strategy,
harvested  energy  is  prioritized  to  power  information  relaying 
while  the  remainder  is  accumulated  and  stored  for  future 
usage  with  the  help  of  a  battery  in  the  proposed  strategy, 
which  supports  an  efficient  utilization  of  harvested  energy. 
However,  PS  affects  throughput  at  subsequent  time  slots  due  to 
the  battery  operations  including  the  charging  and  discharging. 
To  this  end,  PS  and  battery  operations  are  coupled  with 
distributed beamforming. A throughput optimization problem
that incorporates these coupled operations is formulated, although it
is intractable. To address the intractability of the optimization,
a  layered  optimization  method  is  proposed  to  achieve  the 
optimal  joint  PS  and  battery  operation  design  with  non-causal 
channel  state  information  (CSI),  in  which  the  PS  and  the 
battery  operation  can  be  analyzed  in  a  decomposed  manner. 
Then,  a  general  case  with  causal  CSI  is  considered,  where 
the  proposed  layered  optimization  method  is  extended  by 
utilizing  the  statistical  properties  of  CSI.  To  reach  a  better 
tradeoff  between  performance  and  complexity,  a  greedy  method 
that  requires  no  information  about  subsequent  time  slots  is 
proposed. Simulation results reveal the upper and lower bounds
on the performance of the proposed strategy, which are reached by
the  layered  optimization  method  with  non-causal  CSI  and  the 
greedy  method,  respectively.  Moreover,  the  proposed  strategy 
outperforms  the  conventional  PS-based  relaying  without  energy 
accumulation  and  time  switching-based  relaying  strategy. 

Index Terms — Wireless-powered communication, power splitting,
harvest-use-store, channel state information.

I. Introduction

The  fifth  generation  (5G)  communication  networks  are 
expected  to  support  new  emerging  services  with  high  network 
capacity, as well as reduced delay and energy consumption.
To achieve these requirements, the use of super-dense
small cell deployments and centralized resource management,
i.e., cloud radio access networks, is becoming an appealing
approach [1], [2]. However, to fulfill the desired coverage,
some  wireless  nodes  in  the  5G  network  might  need  to 
be  deployed  in  places  lacking  an  external  power  supply. 
To  this  end,  energy  harvesting  approaches  that  scavenge 
energy  from  the  ambient  environment  are  recognized  as  a 
key enabling technology for these self-sustainable nodes [3].
Meanwhile,  cooperative  relay  communication  is  a  promising 
approach  to  enlarge  coverage  and  improve  spectral  efficiency. 
Therefore,  enabling  cooperative  relay  communications  via 
energy  harvesting  is  becoming  a  popular  concept  for  green 
communication,  which  aims  at  decreasing  power  usage,  while 
improving  the  transmission  performance. 

A key concern in energy harvesting enabled cooperative
relay communication is the efficient utilization of
harvested power, which is not steadily replenished as in
traditional grid-aided communication networks. The issue of
improving transmission performance via an efficient utilization
of harvested power has been widely studied for conventional
energy harvesting techniques, where natural resources,
such as solar and wind, are used as energy sources [4], [5].
However, their intermittent and unpredictable nature makes
these sources difficult to exploit in certain environments. As
an alternative to the conventional energy harvesting techniques,
radio-frequency (RF) energy harvesting techniques, in which RF
signals transmitted from the source node can be used as energy
sources for cooperative nodes, are believed to fully unleash the
potential gains of energy harvesting.
Moreover, it has been illustrated in [6] that wireless-powered
cooperative relay communications can be realized within a
boundary distance, which is determined by both the available
transferred energy and the minimum energy requirements
of the harvesting devices. As a result, wireless-powered
cooperative relay communication can be
regarded as a promising solution to highly energy-efficient
networking. Note that, due to the dual purpose of RF signals,
i.e., wireless power transfer (WPT) and wireless information
transfer (WIT), fundamental changes to the designs of green
communication networks are entailed.

A.  Related  Works 

As concluded in [7], energy harvesting receiving architectures
and power management models are two essential
units to realize wireless-powered cooperative relay communications.
The energy harvesting unit is for energy collection,
and  there  are  mainly  two  types  of  energy  harvesting  receiving 
architectures  in  the  literature.  In  particular,  the  first  type  is 
based  on  power  splitting  (PS)  technique,  which  splits  the 
received  RF  signal  into  two  different  power  streams  for 
separate  WPT  and  WIT,  and  the  other  is  based  on  time 
switching  technique,  where  the  received  signal  at  one  time 
slot is used for either WPT or WIT [8]. Note that the
signal received at one time slot is used for both information
processing and power transfer in the PS technique, which is
not allowed in the time switching technique [10]. Therefore,
the  time  switching  technique  is  suboptimal  in  terms  of 
efficiently  using  the  available  signal  power  ©,  and  the  PS 
technique  is  more  suitable  for  applications  with  critical  delay 
constraints  ED.  Besides,  the  power  management  unit  aims  at 
utilizing the harvested power effectively. Most of the research
works conducted so far assume either the harvest-use or the
harvest-store-use model. In the former model, the harvested energy
is directly used and energy accumulation is not allowed
due to the lack of storage units, while in the latter model, the
harvested  energy  is  first  accumulated  and  stored  with  the 
help  of  a  battery  and  then  adaptively  utilized  in  information 
transmission  0. 

The incorporation of the harvest-use model and PS-based
relaying has been studied in [11]-[13], where the harvested
energy through PS is used up at each information relaying.
In particular, in [11], a harvest-use PS-based relaying strategy
was realized in two-way amplify-and-forward relaying
systems, and the outage probability and ergodic capacity
were analyzed. In [12], a multiple relay system was considered,
where several randomly located energy harvesting
relays help the transmission between a source-destination
pair. The proposed strategy was shown to achieve the same
diversity gain as the case with conventional self-powered
relays. Multiple antenna configurations were considered
in [13], where a sophisticated relaying strategy that jointly
utilizes harvest-use PS and antenna selection was designed
and optimized to improve the achievable rate. Besides,
harvest-use time switching-based relaying was studied in
multi-tier uplink cellular networks in [14], where each user
transmits  only  when  the  harvested  energy  at  one  time  slot 
is  sufficient  for  the  designed  power  control  scheme,  and  no 
user  can  save  the  extra  harvested  energy  for  the  next  time 
slot.  Though  the  harvest-use  model  is  easy  to  implement,  it 
would  perform  better  if  energy  accumulation  is  allowed  to 
store  a  part  of  the  harvested  power  for  future  usage  ED, 
which is known as the harvest-store-use model. This model
has been incorporated with time switching-based relaying in
three-node relaying networks in [15], where data relaying was
realized  when  sufficient  power  was  collected  through  time 
switching  technique,  and  the  remaining  power  was  stored  for 
future  usage.  The  outage  probability  of  the  proposed  strategy 
was  studied.  Besides,  a  similar  strategy  was  discussed  for 
user  equipment  relay  in  device-to-device  communications 
in [16].

Note that it is still an open problem to allow energy
accumulation in PS-based relaying strategies. Furthermore,
the harvest-store-use model is well justified if the battery
has perfect efficiency. However, nearly all practical batteries
suffer  a  storage  loss  to  varying  extents,  ranging  from  10%  to 
30%  ED.  From  the  perspective  of  energy  efficiency,  a  new 
harvest-use-store  model  has  been  proposed  in  ED,  where 
the harvested energy is prioritized for use in data transmission
while its balance/debt is stored in or extracted from the
battery, which avoids unnecessary energy loss in storing. This
new model has been combined with the conventional energy
harvesting techniques in [18].

B.  Motivations  and  Contributions 

To realize an efficient utilization of harvested energy and
improve spectral efficiency, energy accumulation is realized
in PS-based relaying in this paper. In particular, the combination
of PS-based relaying and the harvest-use-store model
to improve throughput performance is considered. Different
from  the  previous  works  with  time  switching-based  case 
in  021,  ED,  information  transfer  and  battery  operations 
including  the  charging  and  discharging  happen  at  the  same 
time slot in the PS-based case. Moreover, an enhanced
PS technique needs to be designed to support the battery
operations. Addressing these challenging issues is the main
concern of this paper, and the main contributions can be
summarized as follows:

• To realize an efficient utilization of harvested energy
with the help of a battery, a harvest-use-store PS relaying
strategy with distributed beamforming is proposed for
the wireless-powered multiple-relay scenario. Specifically,
each relay obtains both information and power
from the received signals transmitted by the source
node via power splitting. Subsequently, with the help
of a battery, the harvested power is used to amplify-and-forward
the information to the destination through
distributed beamforming with adaptive power allocation.

• To reveal a theoretical bound of the proposed strategy,
a throughput maximization problem is formulated
and solved under an ideal non-causal channel state
information (CSI) assumption. Though the formulated
optimization problem looks intractable, a layered
optimization method that derives the optimal solution is
developed. In particular, the joint PS and battery operation
design in the proposed strategy is decomposed into
two layers, such that the original optimization problem
is transformed into a dynamic programming problem with
a subproblem requiring optimization embedded in it.
Then, to address the non-convex embedded subproblem,
an alternating-Dinkelbach optimization is proposed to
transform the subproblem into a convex form. Further, the
dynamic programming problem can be solved by using
backward induction.

IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 34, NO. X, JAN. 2016

• To study the throughput performance of the proposed
harvest-use-store PS relaying strategy in a general scenario
with causal CSI, the proposed layered optimization
method is extended by utilizing the statistical properties
of the CSI via incorporating a finite-state Markov chain
model. Further, a greedy method is proposed, which
requires no information about subsequent time slots. It is
shown that the adopted greedy method uses up the
harvested energy at each transmission. Simulation results
reveal that the advantages of the proposed strategy
over the conventional battery-free ones depend on the
utilization of information about subsequent time slots.

The remainder of this paper is outlined as follows. In
Section II, the system model is presented and the proposed
strategy is described. Following that, the optimal
joint PS and battery operation design is developed under
the non-causal CSI assumption in Section III. Then, the
causal CSI case is analyzed in Section IV, followed by
numerical results in Section V and conclusions in the final
section.

II.  System  Model  and  Protocol  Description 

The system under consideration is a wireless-powered
cooperative relay network consisting of a source (S), a destination
(D), and a set of $K$ energy harvesting relays ($R_k$, $k = 1, 2, \cdots, K$),
where each relay is equipped with a battery.
No direct link exists between S and D. A half-duplex
relay model and a total of $T$ equal-length time slots are considered.
A harvest-use-store PS strategy is proposed and implemented
at each time slot, each of which consists of two equal-length
phases. In the first phase, each relay obtains both information
and power from its received RF signal transmitted by S
via power splitting. In the second phase, with the help of the
battery, the harvested energy is used to amplify-and-forward
the information to D through distributed beamforming with
adaptive power allocation.

A.  Simultaneous  Information  and  Power  Transfer  via  Power 
Splitting 

In the first phase of time slot $t$, the received RF signal at
$R_k$ can be expressed by

$$y_k(t) = h_k(t)\sqrt{p_s(t)}\,x_s(t) + z_k^a(t), \tag{1}$$

where $t$ denotes the time index, $k$ is the index for relays,
$h_k(t)$ denotes the complex link gain between S and $R_k$,
$x_s(t)$ denotes the signal transmitted from S during time slot
$t$ with normalized power $E(|x_s(t)|^2) = 1$, and $p_s(t)$ is the
transmit power at S with $p_s(t) \le P$. Finally, $z_k^a(t)$ is the
ditive  zero  mean  variance  a %  white  Gaussian  noise  (AWGN) 
in  the  received  signal  ifTTI.  thus  z%(t)  ~  CN  (0,  crjj) . 


Fig. 1. Processing at the energy harvesting relay $R_k$ during the $t$-th time
slot.

The received RF signal $y_k(t)$ is then divided into three
different power streams through PS, as depicted in
Fig. 1. Specifically, the first part of the received signal,
which uses a PS ratio $\lambda_{k,I}(t) \in [0, 1]$, is down-converted to
baseband and will be amplified and transmitted in the next
phase. The sampled baseband signal is given by

$$x_{k,R}(t) = \sqrt{\lambda_{k,I}(t)}\left(h_k(t)\sqrt{p_s(t)}\,x_s(t) + z_k^a(t)\right) + z_k^b(t), \tag{2}$$

where $z_k^a(t) \sim \mathcal{CN}(0, \sigma_a^2)$ is the baseband equivalent
of the passband antenna noise, and $z_k^b(t) \sim \mathcal{CN}(0, \sigma_b^2)$ is the
sampled AWGN introduced by RF band to baseband signal
conversion  ED-  The  second  part  of  signal  power,  which  uses 
a PS ratio $\lambda_{k,F}(t) \in [0, 1 - \lambda_{k,I}(t)]$, will be used to power
the amplify-and-forward process in the next phase. This part
of power is described by

$$p_{k,F}(t) = \eta_1 \lambda_{k,F}(t)\left(|h_k(t)|^2 p_s(t) + \sigma_a^2\right), \tag{3}$$

where $\eta_1 \in (0, 1]$ is a constant denoting the energy conversion
efficiency from signal power to DC power. The last
part of signal power, which uses a PS ratio
$\lambda_{k,B}(t) = 1 - \lambda_{k,I}(t) - \lambda_{k,F}(t)$, will be used to charge the battery.
This part of power is denoted by

$$p_{k,B'}(t) = \eta_1 \lambda_{k,B}(t)\left(|h_k(t)|^2 p_s(t) + \sigma_a^2\right). \tag{4}$$
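
The three-way split in (3) and (4) can be illustrated with a small numerical sketch. It is only a sketch under stated assumptions: the function name `split_power`, the argument names, and the default values of `eta1` and `sigma_a2` are hypothetical and do not come from the paper.

```python
def split_power(h_abs2, p_s, lam_I, lam_F, eta1=0.5, sigma_a2=0.0):
    """Split the received power at one relay in one time slot.

    h_abs2 : |h_k(t)|^2, squared magnitude of the S-to-relay link gain
    p_s    : source transmit power p_s(t)
    lam_I  : PS ratio for information processing, in [0, 1]
    lam_F  : PS ratio for powering the forwarding, in [0, 1 - lam_I]
    eta1   : RF-to-DC conversion efficiency (assumed value)
    Returns (p_F, p_B_prime, lam_B) following (3) and (4).
    """
    if not (0.0 <= lam_I <= 1.0 and 0.0 <= lam_F <= 1.0 - lam_I):
        raise ValueError("need 0 <= lam_I <= 1 and 0 <= lam_F <= 1 - lam_I")
    lam_B = 1.0 - lam_I - lam_F           # remaining share charges the battery
    rx_power = h_abs2 * p_s + sigma_a2    # received signal plus antenna noise power
    p_F = eta1 * lam_F * rx_power         # equation (3)
    p_B_prime = eta1 * lam_B * rx_power   # equation (4)
    return p_F, p_B_prime, lam_B
```

For example, `split_power(2.0, 1.0, 0.5, 0.25)` leaves a quarter of the received power for charging, so the forwarding and charging streams carry the same power.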

A finite and discrete battery model is adopted in this
paper, where the battery is of size $B_{max} = \alpha P$ ($\alpha > 0$)
and is discretized into $L + 1$ energy levels
$\mathcal{T} = \{0, B_{max}/L, \cdots, B_{max}\}$. Note that this model can closely
approximate a continuous battery model when the number of
energy levels is sufficiently large [22]. Due to this finite and
discrete battery model, the charged power at $R_k$ during time
slot $t$ is given by

$$\begin{aligned}
p_{k,B}(t) &= \min\left\{B_{max} - B_k(t),\ \frac{n^*(t)}{L} B_{max}\right\}, \\
n^*(t) &= \arg\max_{n(t) \in \{0, \cdots, L\}} \left\{ n(t) : \frac{n(t)}{L} B_{max} \le \eta_2\, p_{k,B'}(t) \right\},
\end{aligned} \tag{5}$$

where $B_k(t) \in \mathcal{T}$ denotes the energy level at the beginning
of time slot $t$, $\eta_2 \in (0, 1]$ is the storage efficiency describing
the power loss in battery charging Q, the first equation is
due  to  the  finite  property  of  the  battery,  and  the  second  one 
is  due  to  the  discrete  property  of  the  battery. 
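
A minimal sketch of the charging rule in (5) follows, assuming the quantization reads "largest level count whose energy does not exceed the stored share". The name `charged_power`, the default `eta2`, and the small floating-point tolerance are illustrative assumptions, not part of the paper.

```python
import math

def charged_power(p_B_prime, B_k, B_max, L, eta2=0.8):
    """Quantized, capacity-limited charge p_{k,B}(t) in the spirit of (5).

    p_B_prime : split power routed to the battery, p_{k,B'}(t)
    B_k       : current energy level B_k(t), a multiple of B_max / L
    eta2      : storage efficiency (assumed value)
    """
    step = B_max / L
    # Largest n with (n / L) * B_max <= eta2 * p_B_prime, capped at L.
    # The 1e-12 tolerance only guards against floating-point round-off.
    n_star = min(L, math.floor(eta2 * p_B_prime / step + 1e-12))
    # The battery cannot hold more than B_max in total.
    return min(B_max - B_k, n_star * step)
```

With `B_max = 1.0` and `L = 4`, a split power of 0.7 stores two levels (0.5) in an empty battery, while a battery already at 0.75 accepts only 0.25.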

B.  Distributed  Beamforming  with  Adaptive  Power  Allocation 

In the second phase, the destination D receives a signal
transmitted from all relays as

$$y(t) = \sum_{k=1}^{K} \beta_k(t)\, g_k(t)\, e^{j\theta_k(t)}\, x_{k,R}(t) + z(t), \tag{6}$$

where $g_k(t)$ denotes the link gain from $R_k$ to D, $z(t) \sim \mathcal{CN}(0, \sigma_D^2)$
is the AWGN at D, and $e^{j\theta_k(t)}$ is derived from the
distributed  beamforming  design  EH-  where 

$$\theta_k(t) = -\left(\arg h_k(t) + \arg g_k(t)\right), \tag{7}$$

which cancels the phases of the two links between $R_k$ and
S, D respectively. $\beta_k(t)$ is the amplification gain given by

$$\beta_k(t) = \sqrt{\frac{p_{k,R}(t)}{\lambda_{k,I}(t)\left[|h_k(t)|^2 p_s(t) + \sigma_a^2\right] + \sigma_b^2}}, \tag{8}$$

where $p_{k,R}(t)$ is the transmit power at $R_k$. This power is
composed of two different parts:

$$p_{k,R}(t) = p_{k,F}(t) + b_{k,F}(t), \tag{9}$$

where $p_{k,F}(t)$ is provided by the PS operation in (3), and
$b_{k,F}(t) \in \mathcal{T}$ is provided by the battery discharging. As a
result, the energy level at $R_k$ at the end of time slot $t$ (which
is also the beginning of time slot $t + 1$) is

$$B_k(t+1) = B_k(t) + p_{k,B}(t) - b_{k,F}(t). \tag{10}$$
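
The per-relay processing in (7) to (10) can be sketched as one function. This is an illustration only: `relay_step` and its argument names are hypothetical, the default noise powers are assumed values, and the denominator of the gain follows the form of (8) in which the baseband noise power $\sigma_b^2$ is included.

```python
import cmath
import math

def relay_step(h, g, lam_I, p_F, p_B, b_F, B_k, p_s,
               sigma_a2=0.0, sigma_b2=0.01):
    """Second-phase quantities for one relay in one time slot.

    h, g : complex link gains S-to-relay and relay-to-D
    p_F  : power from the PS stream, b_F : power drawn from the battery
    p_B  : power charged into the battery, B_k : current energy level
    Returns (theta, beta, p_R, B_next).
    """
    theta = -(cmath.phase(h) + cmath.phase(g))                # (7): co-phasing angle
    p_R = p_F + b_F                                           # (9): transmit power
    rx = lam_I * (abs(h) ** 2 * p_s + sigma_a2) + sigma_b2    # power of x_{k,R}(t)
    beta = math.sqrt(p_R / rx)                                # (8): amplification gain
    B_next = B_k + p_B - b_F                                  # (10): battery evolution
    return theta, beta, p_R, B_next
```

The co-phasing angle makes `h * g * cmath.exp(1j * theta)` real and non-negative, which is what lets the relay signals add coherently at D.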


Note that any power consumption at the relays for purposes
other than transmission is assumed negligible 112.
03. Since the antenna noise $z_k^a(t)$ has a
negligible impact on both the information processing and
energy harvesting E2, it is ignored in the following analysis
by setting $\sigma_a^2 = 0$. Substituting (2) and (8) into (6), the
received signal-to-noise ratio (SNR) at D at time slot $t$ can
be expressed as

$$\mathrm{SNR}(t) = \frac{p_s(t)\left(\sum_{k=1}^{K} \beta_k(t)\,|h_k(t)\, g_k(t)|\sqrt{\lambda_{k,I}(t)}\right)^2}{\sum_{k=1}^{K} \beta_k(t)^2\, |g_k(t)|^2 \sigma_b^2 + \sigma_D^2}. \tag{11}$$

The overall system throughput after $T$ transmissions is
described by

$$R_{total} = \frac{1}{2}\sum_{t=1}^{T} \log\left(1 + \mathrm{SNR}(t)\right), \tag{12}$$

where the factor 1/2 is due to the half-duplex relaying mode.

According to the signal and power models above, four key
factors should be jointly considered to optimize the
overall throughput, including information transfer (i.e.,
designing $\lambda_{k,I}(t), \forall k, \forall t$), power transfer (i.e., designing
$\lambda_{k,F}(t), \forall k, \forall t$), battery charging (i.e., designing
$\lambda_{k,B}(t), \forall k, \forall t$), and battery discharging (i.e., designing
$b_{k,F}(t), \forall k, \forall t$). The first three factors are mainly determined
by the PS, while the last one is only determined by
the battery operation. Moreover, the harvested energy (i.e.,
$p_{k,B}(t), \forall k, \forall t$) obtained from the PS affects the throughput
during the subsequent time slots via battery charging and
discharging. To this end, the PS and battery operations are
coupled with distributed beamforming, which makes the
throughput maximization problem intractable.

III. Optimal Joint Power Splitting and Battery
Operation Design

In this section, the proposed harvest-use-store PS relaying
strategy is optimized toward the goal of throughput maximization
with the CSI of all $T$ time slots known before transmission,
which is commonly known as the non-causal CSI assumption.
To solve the resulting intractable-looking problem, the
throughput maximization problem is first described equivalently
as a dynamic programming problem. However, such a solution
is computationally prohibitive to implement since the number
of possible joint PS and battery operations to be evaluated
is infinite. Therefore, the joint PS and battery operations
are decomposed equivalently, and the resulting embedded
problem and overall problem are formulated. Moreover, the
optimal solutions to the embedded and overall problems can
be approached through iterative algorithms and backward
induction, respectively. In this way, the optimal solution toward
throughput maximization is approached.


A. Decoupling of the Throughput Maximization Problem

We first simplify the formulation of the throughput maximization
problem by utilizing insights about the optimization
variables. According to (11) and (12), the throughput $R_{total}$
is a strictly increasing function of $p_s(t), \forall t$, when the other
variables are fixed. Since $p_s(t) \le P$, the optimum choice
is

$$p_s(t) = P, \ \forall t. \tag{13}$$

Next, the optimized $\lambda_{k,B}(t)$ in (4) and (5) should satisfy

$$\eta_2\, p_{k,B'}(t) = p_{k,B}(t) = \frac{n^*(t)}{L} B_{max}, \ \forall t, \forall k. \tag{14}$$

Note that, otherwise, the throughput $R_{total}$ can generally be further
improved by reassigning the PS ratios: $\lambda_{k,B}(t)$ is decreased
to meet (14), $\lambda_{k,F}(t)$ remains the same, and $\lambda_{k,I}(t)$ is
increased, which can be seen according to (2), (3), (4),
and (11). The result in (14) implies that no split signal power
for battery charging is discarded due to the finite
and discrete property of the battery, because the discarded
part could have been assigned for information processing to
improve the transmission performance via more proper
PS ratios.
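
With the source power fixed as in (13), the SNR in (11) and the throughput in (12) can be evaluated numerically. The sketch below is illustrative only: the function names are hypothetical, all per-relay inputs are passed as plain lists, and a base-2 logarithm is assumed because the text writes "log" without a base.

```python
import math

def snr(P, beta, h_abs, g_abs, lam_I, sigma_b2, sigma_D2):
    """Received SNR at D for one time slot, following (11).

    beta, h_abs, g_abs, lam_I : per-relay lists of amplification gains,
    link-gain magnitudes |h_k|, |g_k|, and information PS ratios.
    """
    coherent_sum = sum(b * h * g * math.sqrt(l)
                       for b, h, g, l in zip(beta, h_abs, g_abs, lam_I))
    signal = P * coherent_sum ** 2
    noise = sum(b ** 2 * g ** 2 * sigma_b2 for b, g in zip(beta, g_abs)) + sigma_D2
    return signal / noise

def total_throughput(snrs):
    """Throughput over all time slots, following (12); log base 2 is assumed."""
    return 0.5 * sum(math.log2(1.0 + s) for s in snrs)
```

Because the relay signals are co-phased, the numerator grows with the square of the sum of per-relay amplitudes, while the forwarded noise only adds in power.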
To make the analysis more concise, the energy level
variation is introduced to describe the effects of battery
charging and discharging, which is denoted by $v_k(t)$ at $R_k$
during time slot $t$, such that

$$v_k(t) = B_k(t) - B_k(t+1) \overset{(a)}{=} b_{k,F}(t) - \eta_1 \eta_2 \lambda_{k,B}(t) P |h_k(t)|^2, \tag{15}$$

where the equation (a) is derived from (4), (10), and (14).
To avoid unnecessary power loss caused by battery charging
(due to $\eta_2$), it is assumed that, at each battery, the charging
and discharging operations do not occur during the same
time slot. In particular, if the energy level at $R_k$ decreases
at time slot $t$ (i.e., $v_k(t) > 0$), the battery is not charged
in the first phase (i.e., $\lambda_{k,B}(t) = 0$), and is discharged in the
second one, which implies that the stored battery power is
utilized to back up the transmit power. On the other hand,
if the energy level increases (i.e., $v_k(t) < 0$), the battery is
charged in the first phase, and not discharged in the second
one (i.e., $b_{k,F}(t) = 0$), which implies that a part of the
harvested power is stored for future usage. Based on (15),
these relationships can be described by

$$b_{k,F}(t) = \max\{0, v_k(t)\}, \qquad \lambda_{k,B}(t) = -\min\left\{0, \frac{v_k(t)}{\eta_1 \eta_2 P |h_k(t)|^2}\right\}. \tag{16}$$

Substituting (3), (13), and (16) into (9), the transmit power
for $R_k$ at time slot $t$ is rewritten as

$$p_{k,R}(t) = \eta_1 \left(1 - \lambda_{k,I}(t)\right) P |h_k(t)|^2 + \min\left\{0, \frac{v_k(t)}{\eta_2}\right\} + \max\{0, v_k(t)\}. \tag{17}$$
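
The mapping from the energy level variation to the battery operations in (16), and the resulting transmit power in (17), can be checked numerically. This is a sketch with hypothetical function names; the antenna noise is taken as zero, as in the analysis above.

```python
def battery_ops(v, h_abs2, P, eta1, eta2):
    """Return (b_F, lam_B) implied by the energy level variation v, per (16)."""
    b_F = max(0.0, v)                                   # discharge when the level drops
    lam_B = -min(0.0, v / (eta1 * eta2 * P * h_abs2))   # charge when the level rises
    return b_F, lam_B

def transmit_power(lam_I, v, h_abs2, P, eta1, eta2):
    """Relay transmit power p_{k,R}(t) as rewritten in (17)."""
    return (eta1 * (1.0 - lam_I) * P * h_abs2
            + min(0.0, v / eta2)
            + max(0.0, v))
```

As a consistency check, computing $p_{k,F}(t)$ from (3) with $\lambda_{k,F} = 1 - \lambda_{k,I} - \lambda_{k,B}$ and adding $b_{k,F}(t)$ as in (9) gives the same value as `transmit_power`.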

As a result, the throughput maximization problem can be
formulated as

$$\begin{aligned}
(\mathrm{P1}):\ \max_{\mathbf{I}(t),\, \mathbf{v}(t),\, \forall t} \quad & R_{total} \\
\text{s.t.} \quad C1:\ & 0 \le \lambda_{k,I}(t) \le 1 + \min\left\{0, \frac{v_k(t)}{\eta_1 \eta_2 P |h_k(t)|^2}\right\},\ \forall k, \forall t, \\
C2:\ & \frac{L}{B_{max}}\, v_k(t) \in \{-L, \cdots, 0, \cdots, L\},\ \forall k, \forall t, \\
C3:\ & B_k(1) - \sum_{i=1}^{t-1} v_k(i) - B_{max} \le v_k(t) \le B_k(1) - \sum_{i=1}^{t-1} v_k(i),\ \forall k, \forall t,
\end{aligned} \tag{18}$$

where $\mathbf{I}(t) = (\lambda_{1,I}(t), \cdots, \lambda_{K,I}(t))$ describes the
information transfer design at time slot $t$ and
$\mathbf{v}(t) = (v_1(t), \cdots, v_K(t))$ denotes the energy level variation
design. The first constraint C1 is derived from the definitions
of the PS ratios and the constraint in (16), which implies that
the range of allowable information transfer designs changes
with the energy level variation designs. The second constraint
C2 is derived from the definitions of $B_k(t)$ and $v_k(t)$, and
is due to the finite and discrete property of the battery. The
third constraint C3 is derived from (15), which implies that
the design of the energy level variation is coupled over time.
Since the allowable values for $v_k(t), \forall k, \forall t$ are discrete, Problem
(P1) is a mixed integer optimization problem and non-convex.

To handle this time-coupled joint PS and battery operation
design in Problem (P1), the following components are defined
to transform Problem (P1) equivalently into a dynamic programming
problem.

1) battery state: The battery state describes the energy levels
at all batteries at the beginning of each transmission,
which is described by $\mathbf{S}(t) = \{B_1(t), \cdots, B_K(t)\}$ at
time slot $t$. Without loss of generality, the initial battery
state is set as $\mathbf{S}(1) = \{0, \cdots, 0\}$.

2) decision: The decision denotes the energy level variation
designs and the information transfer designs at all relays,
which is denoted by $\mathbf{D}(t) = (\mathbf{v}(t), \mathbf{I}(t))$ at time slot $t$.

3) state evolution: The state evolution describes the change
of battery states over two adjacent time slots, which can
be denoted by

$$\mathbf{S}(t) - \mathbf{S}(t+1) = \mathbf{v}(t) \tag{19}$$

at time slot $t$. It can be interpreted as
$B_k(t) - B_k(t+1) = v_k(t), \forall k$, which is derived from (15).
Note that the energy level variation design results in the
state evolution. Moreover, the range of allowable energy level
variation designs is determined by the battery states as

$$B_k(t) - B_{max} \le v_k(t) \le B_k(t),\ \forall k, \tag{20}$$

which is derived from (15) and the constraint C3 in Problem
(P1).

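4) payoff: The payoff is the throughput at each time slot.
According to the SNR and throughput expressions in Section II,
the payoff is determined by the
battery state $\mathbf{S}(t)$, the decision $\mathbf{D}(t)$ and the time slot
$t$, which can be described by

$$R(\mathbf{S}(t), \mathbf{D}(t), t) = \frac{1}{2}\log\left(1 + \mathrm{SNR}(t)\right). \tag{21}$$

5) summation of payoffs: The summation of payoffs starting
from time slot $T$ and summing backward to the first
time slot (to facilitate backward induction) is defined as

$$\begin{aligned}
U(\mathbf{S}(t), \mathbf{D}(t), t) &= R(\mathbf{S}(t), \mathbf{D}(t), t) + U(\mathbf{S}(t+1), \mathbf{D}(t+1), t+1),\ \forall t \ne T, \\
U(\mathbf{S}(T), \mathbf{D}(T), T) &= R(\mathbf{S}(T), \mathbf{D}(T), T).
\end{aligned} \tag{22}$$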

In this way, the throughput in Problem (P1) is the summation
of payoffs through $T$ time slots (i.e.,
$R_{total} = U(\mathbf{S}(1), \mathbf{D}(1), 1)$). Based on equation (22) and the Bellman
equation [25], the optimality equations to solve Problem (P1)
can be written as

$$U^*(\mathbf{S}(t), t) = \max_{\mathbf{v}(t),\, \mathbf{I}(t)} \left\{ R(\mathbf{S}(t), \mathbf{v}(t), \mathbf{I}(t), t) + U^*(\mathbf{S}(t+1), t+1) \right\},\ \forall t \ne T, \quad \text{s.t. } C1, C2, C3, \tag{23}$$

$$U^*(\mathbf{S}(T), T) = \max_{\mathbf{v}(T),\, \mathbf{I}(T)} R(\mathbf{S}(T), \mathbf{v}(T), \mathbf{I}(T), T), \quad \text{s.t. } C1, C2, C3. \tag{24}$$

As in many typical dynamic programming problems, we can
work on one time slot at a time, starting from the last
one (i.e., $U^*(\mathbf{S}(T), T)$) and working backward (i.e., until
$U^*(\mathbf{S}(1), 1)$), using what is called backward induction [25].
As a result, the optimal set of decisions can be found by the
backward induction algorithm, which essentially evaluates all
possible sets of decisions and eventually picks the best.¹
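
Backward induction over a discrete battery can be sketched as follows. This is a simplified illustration, not the authors' algorithm: it assumes a single relay, measures the battery in integer level units, and takes a hypothetical callback `payoff(t, level, dv)` that is assumed to already return the best per-slot throughput for the given energy level variation.

```python
def backward_induction(T, L, payoff):
    """Tabulate optimal values and decisions for one relay.

    T      : number of time slots (slots are indexed 1..T)
    L      : the battery holds 0..L level units
    payoff : payoff(t, level, dv) -> reward at slot t when the battery holds
             `level` units and changes by dv = level - next_level units
    Returns (value, policy), where value[t][level] is the best total payoff
    from slot t onward and policy[t][level] is the maximizing dv.
    """
    value = [[0.0] * (L + 1) for _ in range(T + 2)]   # value[T + 1][*] = 0
    policy = [[0] * (L + 1) for _ in range(T + 1)]
    for t in range(T, 0, -1):                          # last slot first
        for level in range(L + 1):
            best, best_dv = float("-inf"), 0
            # Constraint (20): level - L <= dv <= level keeps the next level in [0, L].
            for dv in range(level - L, level + 1):
                total = payoff(t, level, dv) + value[t + 1][level - dv]
                if total > best:
                    best, best_dv = total, dv
            value[t][level] = best
            policy[t][level] = best_dv
    return value, policy
```

In a toy case where the second slot has a much better channel than the first, the table shows that storing energy in slot 1 and discharging in slot 2 beats spending everything immediately.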

B.  Decomposing  of  the  Joint  PS  and  Battery  Operation 


C.  Optimizing  of  the  Embedded  Problem  and  the  Overall 
Problem 

We  first  solve  the  embedded  problem  in  (l25l>  with  each 
allowable  battery  operation  v(f).  Since  the  objective  function 
in  03  is  monotonically  increasing  in  terms  of  SNR(t), 
this  payoff  maximization  problem  is  clearly  equivalent  to  the 
SNR  maximization  problem  111 91.  which  is  described  by 


Though  the  optimal  solution  to  Problem  (PI)  can  be 
obtained  through  dynamic  programming,  such  a  solution 
is  computationally  prohibitive  to  implement  because  the 
PS  ratios  1(f)  in  each  decision  are  continuous  variables, 
which  makes  the  number  of  possible  decisions  infinite.  To 
overcome  this  computationally  prohibitive  issue,  all  possible 
decisions  are  evaluated  in  a  decomposed  manner,  and  thus  the 
optimization  problems  in  (l23l>  and  (l24t  are  solved  through 
a  two  step  procedure.  The  basic  idea  is  that  we  optimize 
R(S (f),  v(f),  1(f),  f)  over  /(f)  for  each  of  the  possible  values 
of  v(t)  in  (23)  and  (24),  where  these  optimized  results  are 
denoted  by  I*(v(£)).  Since  we  have  the  optimum  I*(v(£)) 
for  each  of  the  possible  values  of  v(£),  we  can  use  this 
to  eliminate  1(f)  in  (23)  and  (24).  Correspondingly,  the 
formulated  subproblem  in  the  first  step  is  given  by 

max  i?(S(f),  v(f),  1(f),  f) 

ip) 

=  max  ilog  (1  +  SNR  (£)),  (25) 

Al  ,/(*),■■■ 

S.t.  Cl. 

which  is  derived  from  (123b.  (1241  and  d2TTl.  and  is  called  the 
embedded  problem  in  the  following.  Note  that,  the  embedded 
problem  needs  to  be  solved  for  each  possible  v(£)  with  a 
certain  (S (£),£). 

In  the  second  step,  by  substituting  the  optimal  solutions 
I*(v(£))  to  (123b  and  (l24i).  this  dynamic  programming  prob¬ 
lem  can  be  simplified  to 


max  SNR  ( t ) 

Al 

s.t.  0<Afc;/(f)  <l  +  min{0 

(28) 

By  substituting  ([8}  (ITU  and  (fT7t  in  (128b  and  omitting 
Afc,/(f),  hk(t),  gk{t)  and  vk{tfs  dependence  on  f  since  t 
is  constant  during  the  optimization,  this  SNR  maximization 
problem  can  be  rewritten  as 

(P 2)  :  max  J(x i,--  -  ,Xk) 

Xl,—  ,XK 

^  m\9k\2(ak-xk)(^--^^ 

12  mlgk^lak-x^^y+crjj 
k= 1 

s-t.ol  <  xk  <  (l  +  min  {0,  P\hk\2  +  ^2,Vfc 

(29) 

where  ak  =  P\hk\2  +  max  jo,  ^  j  +  min  jo,  ^  j  +  of, 
and  xk  =  XkjP\h.k\2  +  of. 

A key challenge in solving Problem (P2) is the lack of convexity in the problem formulation. Alternating optimization [20] and the Dinkelbach algorithm [23] are therefore utilized jointly to transform Problem (P2) into a convex form, which is called alternating-Dinkelbach for short in this paper. First, the $K$ variables (i.e., $x_1, \cdots, x_K$) in Problem (P2) are handled by alternating optimization, where one variable is updated at a time while fixing the others. Specifically, the decomposed subproblem to optimize $x_j$ is given by Problem (P3) in (30).


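$$U^*(\mathbf{S}(t), t) = \max_{\mathbf{v}(t)}\left\{R(\mathbf{S}(t), \mathbf{v}(t), \mathbf{I}^*(\mathbf{v}(t)), t) + U^*(\mathbf{S}(t+1), (t+1))\right\}, \ \forall (t \neq T), \quad \text{s.t. } C2, C3. \tag{26}$$

$$U^*(\mathbf{S}(T), T) = \max_{\mathbf{v}(T)} R(\mathbf{S}(T), \mathbf{v}(T), \mathbf{I}^*(\mathbf{v}(T)), T), \quad \text{s.t. } C2, C3. \tag{27}$$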

Note that, since each possible $\mathbf{v}(t)$ is evaluated with the corresponding $\mathbf{I}^*(\mathbf{v}(t))$, the number of decisions to be evaluated is $(L + 1)^K$ for each state $\mathbf{S}(t)$ at each time slot $t$. Finding the optimal set of decisions in (26) and (27) is called the overall problem in the following. In this way, the joint PS and battery operation is decomposed, and Problem (P1) is transformed into a dynamic programming problem (overall problem) and a subproblem requiring optimization (embedded problem).


$$(\mathrm{P3}): \max_{x_j} \frac{F_1(x_j)}{F_2(x_j)}$$

s.t.  a2  <  Xj  <  (l  +  min  {o,  mri^\h,f})  P\hj\ 2  +  cr2, 

(30) 

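where $F_1(x_j) = \left(\sqrt{\eta_1 |g_j|^2 (a_j - x_j)\left(1 - \frac{\sigma^2}{x_j}\right)} + \sum_{k \neq j}\sqrt{\eta_1 |g_k|^2 (a_k - x_k)\left(1 - \frac{\sigma^2}{x_k}\right)}\right)^2 \geq 0$ and $F_2(x_j) = \eta_1 |g_j|^2 (a_j - x_j)\frac{\sigma^2}{x_j} + \sum_{k \neq j}\eta_1 |g_k|^2 (a_k - x_k)\frac{\sigma^2}{x_k} + \sigma_D^2 > 0$.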

Then the objective function in Problem (P3) is transformed from the fractional form to the subtractive form via the Dinkelbach algorithm [23]. In particular, defining a parameter $q$ as

$$q = \frac{F_1(x_j)}{F_2(x_j)}, \tag{31}$$

we formulate a subtractive-form optimization problem with a given parameter $q$ as

$$(\mathrm{P4}): F(q) = \max_{x_j} F_1(x_j) - qF_2(x_j), \quad \text{s.t. the constraint on } x_j \text{ in (30)}. \tag{32}$$

Footnote 1: The Viterbi algorithm can be viewed as dynamic programming

To  solve  Problem  (P3),  we  first  present  the  following 
lemma. 

Lemma 1:

$$q' = \frac{F_1(x'_j)}{F_2(x'_j)} = \max_{x_j}\frac{F_1(x_j)}{F_2(x_j)} \tag{33}$$

if, and only if,

$$F(q') = \max_{x_j} F_1(x_j) - q'F_2(x_j) = F_1(x'_j) - q'F_2(x'_j) = 0. \tag{34}$$


Proof: Lemma 1 can be proved by following a similar approach as in [23]. □

Lemma 1 reveals that to solve Problem (P3) with an objective function in fractional form, there exists a corresponding Problem (P4) in subtractive form. Moreover, Lemma 1 provides the condition under which the two problem formulations lead to the same optimal solution $x'_j$. To reach the condition in (34), we focus on Problem (P4) first.

Lemma 2: The optimization objective in Problem (P4) is a concave function in terms of $x_j$.

Proof: Please refer to Appendix A. □

According to Lemma 2 and the fact that the feasible set for $x_j$ is convex, which can be seen in (32), Problem (P4) is a convex optimization problem. However, the complicated expression of $F(q)$ makes it difficult to derive a closed-form solution through the Karush-Kuhn-Tucker (KKT) conditions, thus a numerical method (i.e., the bisection method [24]) is adopted here to handle Problem (P4).

Then, we follow a similar approach as in [23] (known as the Dinkelbach algorithm) to derive the solution that satisfies (34), which is summarized in Algorithm 1.

Algorithm 1 Information Transfer Design Optimization at Relay $R_j$
1: Input: Fixed values of $x_k, \forall (k \neq j)$.
2: Initialization:
3: Set the parameter as $q = 0$, the index of iteration as $n_1 = 0$, and the judgement of convergence as $conv_1 = 0$,
4: Set the maximum number of iterations as $n_1^{\max}$ and the threshold of termination as $\Delta_1$, which is a constant that approaches 0.
5: Repeat:
6: Set $n_1 = n_1 + 1$,
7: Solve Problem (P4) through the bisection method [24], and mark the optimal solution as $x_j^{n_1}$,
8: If $F_1(x_j^{n_1}) - qF_2(x_j^{n_1}) < \Delta_1$
9: Mark the optimal solution as $x'_j = x_j^{n_1}$, and set $conv_1 = 1$.
10: else
11: Update $q$ according to (31).
12: Until: $conv_1 = 1$ or $n_1 = n_1^{\max}$.
13: Return: The optimized information transfer design for $R_j$ as $x'_j$.

Lemma 3: Algorithm 1 converges to $x'_j$ and $q'$, which satisfy (34) in Lemma 1.

Proof: For the purpose of explanation, at the $n$-th iteration, denote the parameter as $q_n$, and the optimized solution to Problem (P4) as $x_j^n$. According to (31), the parameter is updated by $q_{n+1} = \frac{F_1(x_j^n)}{F_2(x_j^n)}$ in the next iteration. Assume that the iteration process will not be terminated in the $(n+1)$-th iteration.

First, it is shown that the optimized value $F(q)$ in Problem (P4) is non-negative:

$$F(q_{n+1}) = \max_{x_j} F_1(x_j) - q_{n+1}F_2(x_j) \geq F_1(x_j^n) - q_{n+1}F_2(x_j^n) = 0. \tag{35}$$

Next, it is revealed that the parameter $q$ increases after each iteration. According to Lemma 1 and (35), since the iteration process is not terminated in the $(n+1)$-th iteration, $F(q_n) > 0$ and $F(q_{n+1}) > 0$ must be true. Thus, $F(q_n)$ can be expressed as

$$F(q_n) = F_1(x_j^n) - q_nF_2(x_j^n) = (q_{n+1} - q_n)F_2(x_j^n) > 0, \tag{36}$$

and since $F_2(x_j^n) > 0$, $q_{n+1} > q_n$ must be true.

Further, it is shown that $F(q)$ decreases after each iteration. Because $q$ increases after each iteration, $F(q_n)$ can be expressed as

$$F(q_n) = \max_{x_j} F_1(x_j) - q_nF_2(x_j) \geq F_1(x_j^{n+1}) - q_nF_2(x_j^{n+1}) > F_1(x_j^{n+1}) - q_{n+1}F_2(x_j^{n+1}) = F(q_{n+1}). \tag{37}$$

Based on the properties in (35) and (37), after a large enough number of iterations, $F(q)$ finally approaches 0. As a result, Algorithm 1 converges to $x'_j$ and $q'$, which satisfy (34) in Lemma 1, and $x'_j$ is the optimal solution to Problem (P3). □
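The parameter update analysed in the proof can be illustrated numerically. The following is a minimal Python sketch of the Dinkelbach iteration, assuming a toy pair $F_1(x) = x(4 - x)$ and $F_2(x) = x + 1$ on $[0.5, 3.5]$ instead of the expressions in (30). The function names `argmax_concave` and `dinkelbach` are hypothetical, and the inner concave problem is solved by bisection on the sign of a finite-difference slope, which is only one possible reading of the bisection step.

```python
from typing import Callable, Tuple


def argmax_concave(f: Callable[[float], float], lo: float, hi: float,
                   tol: float = 1e-9) -> float:
    """Maximize a concave function on [lo, hi] by bisection on its slope sign."""
    h = 1e-6 * (hi - lo)

    def slope(x: float) -> float:
        # Finite difference, clipped so that both points stay inside [lo, hi].
        a, b = max(lo, x - h), min(hi, x + h)
        return (f(b) - f(a)) / (b - a)

    if slope(lo) <= 0.0:      # decreasing everywhere: maximum at the left end
        return lo
    if slope(hi) >= 0.0:      # increasing everywhere: maximum at the right end
        return hi
    a, b = lo, hi
    while b - a > tol:
        mid = 0.5 * (a + b)
        if slope(mid) > 0.0:
            a = mid
        else:
            b = mid
    return 0.5 * (a + b)


def dinkelbach(f1: Callable[[float], float], f2: Callable[[float], float],
               lo: float, hi: float, delta: float = 1e-9,
               max_iter: int = 50) -> Tuple[float, float]:
    """Maximize f1(x)/f2(x) on [lo, hi], assuming f1 - q*f2 is concave for q >= 0."""
    q, x = 0.0, lo
    for _ in range(max_iter):
        # Subtractive-form problem for the current parameter q.
        x = argmax_concave(lambda z: f1(z) - q * f2(z), lo, hi)
        if f1(x) - q * f2(x) < delta:   # F(q) has reached (numerically) zero
            break
        q = f1(x) / f2(x)               # parameter update
    return x, q


# Toy example (assumed, not from the paper): maximize x(4 - x) / (x + 1).
x_opt, q_opt = dinkelbach(lambda x: x * (4.0 - x), lambda x: x + 1.0, 0.5, 3.5)
```

For this toy ratio the stationary point is $x = \sqrt{5} - 1$ with ratio $6 - 2\sqrt{5}$, and the successive values of $q$ increase towards it, in line with the monotonicity argument above.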

Note that the decomposed subproblems to optimize $x_j, j \in \{1, \cdots, K\}$ have the same problem formulation as Problem (P3) except that the index number $j$ is different, thus they can be solved using Algorithm 1. As a result, Problem (P2) is solved using the proposed alternating-Dinkelbach optimization, and the corresponding procedure is summarized in Algorithm 2.

Lemma 4: The proposed Algorithm 2 converges to the optimal solution to Problem (P2).

Proof: Please refer to Appendix B. □

Algorithm 2 Embedded Problem Optimization
1: Input: A battery state $\mathbf{S}(t)$ and an energy level variation design $\mathbf{v}$.
2: Initialization:
3: Set the index of iteration as $n_2 = 0$, and the judgement of convergence as $conv_2 = 0$,
4: Set the maximum number of iterations as $n_2^{\max}$, and the threshold of termination as $\Delta_2$, which is a constant that approaches 0,
5: Set the optimized objective value of Problem (P2) as $J^* = 0$.
6: Repeat:
7: Set $n_2 = n_2 + 1$, calculate the index of the decomposed subproblem as $j = n_2 - \lfloor (n_2 - 1)/K \rfloor K$, where $\lfloor \cdot \rfloor$ will round down.
8: Derive $x'_j$ using Algorithm 1 with the fixed values of $x_k, \forall (k \neq j)$, and calculate $J(x_1, \cdots, x'_j, \cdots, x_K)$ in (29).
9: If $J(x_1, \cdots, x'_j, \cdots, x_K) - J^* < \Delta_2$
10: Set $conv_2 = 1$, and mark the optimal solution as $[x_1^*, \cdots, x_K^*]$.
11: Calculate $\lambda_{k,I}^*, \forall k$ according to the definition $x_k = \lambda_{k,I}P|h_k|^2 + \sigma^2$.
12: else
13: Update $n_2 = n_2 + 1$, and set $J^* = J(x_1, \cdots, x'_j, \cdots, x_K)$.
14: Until: $conv_2 = 1$ or $n_2 = n_2^{\max}$.
15: Return: The optimal embedded information transfer design $\mathbf{I}^* = (\lambda_{1,I}^*, \cdots, \lambda_{K,I}^*)$.
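The alternating structure of Algorithm 2 can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' implementation: the objective is an assumed beamforming-type ratio in which the hypothetical coefficient `c[k]` stands in for $\eta_1 |g_k|^2$, the values of `a` and `c` are made up, and each coordinate update uses a plain grid search in place of Algorithm 1.

```python
import math
from typing import List, Tuple


def objective(x: List[float], a: List[float], c: List[float],
              sigma2: float = 1.0, sigma2_d: float = 1.0) -> float:
    """Assumed ratio objective: coherent signal power over forwarded noise power."""
    amplitude = sum(
        math.sqrt(max(0.0, ck * (ak - xk) * (1.0 - sigma2 / xk)))
        for xk, ak, ck in zip(x, a, c)
    )
    noise = sum(ck * (ak - xk) * sigma2 / xk for xk, ak, ck in zip(x, a, c))
    return amplitude ** 2 / (noise + sigma2_d)


def alternating_maximize(a: List[float], c: List[float], sigma2: float = 1.0,
                         sigma2_d: float = 1.0, grid: int = 401,
                         max_sweeps: int = 50,
                         tol: float = 1e-9) -> Tuple[List[float], List[float]]:
    """Update one variable at a time while the others stay fixed."""
    x = [0.5 * (sigma2 + ak) for ak in a]          # start mid-interval
    history = [objective(x, a, c, sigma2, sigma2_d)]
    for _ in range(max_sweeps):
        for j in range(len(x)):
            lo, hi = sigma2, a[j]
            # Candidate values for x_j; the current value is kept as a candidate
            # so that an update can never decrease the objective.
            candidates = [lo + i * (hi - lo) / (grid - 1) for i in range(grid)]
            candidates.append(x[j])
            x[j] = max(
                candidates,
                key=lambda v: objective(x[:j] + [v] + x[j + 1:], a, c,
                                        sigma2, sigma2_d),
            )
        history.append(objective(x, a, c, sigma2, sigma2_d))
        if history[-1] - history[-2] < tol:        # improvement below threshold
            break
    return x, history


# Hypothetical two-relay instance.
A_VALUES, C_VALUES = [6.0, 4.0], [1.0, 0.5]
x_star, objective_history = alternating_maximize(A_VALUES, C_VALUES)
```

Because every coordinate update keeps the current point as a candidate, the recorded objective values are non-decreasing from sweep to sweep, which mirrors the stopping rule in lines 9 to 14 of Algorithm 2.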


When the optimal solution to the embedded problem is obtained through Algorithm 2, the overall problem can be solved. In particular, the stored power at batteries should be used up at the last time slot or it will be wasted, which is described by $\mathbf{v}^*(T) = \mathbf{S}(T)$. Thus, (27) can be solved as

$$U^*(\mathbf{S}(T), T) = R(\mathbf{S}(T), \mathbf{S}(T), \mathbf{I}^*(\mathbf{S}(T)), T). \tag{38}$$

Then, the optimality equations in (26) can be solved through the backward induction algorithm [25]. In conclusion, the corresponding procedure to solve Problem (P1) is described in Algorithm 3. Note that the performance of the optimal joint PS and battery operation design derived from Algorithm 3 cannot generally be achieved in practice because it requires non-causal CSI; nevertheless, it provides a theoretical upper bound on the performance of the proposed harvest-use-store PS relaying strategy.
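The backward induction over battery states can be sketched in Python as follows. This is a minimal single-relay illustration under stated assumptions: `toy_payoff`, `GAINS` and `HARVEST` are hypothetical stand-ins for the embedded-problem payoff $R(\mathbf{S}(t), \mathbf{v}(t), \mathbf{I}^*(\mathbf{v}(t)), t)$, and the battery has levels $0, \cdots, L$ with the update $S(t+1) = S(t) - v(t)$.

```python
import math
from typing import Callable, List, Tuple

GAINS = [0.2, 3.0, 0.1, 2.5]   # hypothetical link quality for t = 1..4
HARVEST = 2                    # hypothetical energy units harvested per slot


def toy_payoff(t: int, s: int, v: int) -> float:
    """Stand-in payoff: v > 0 discharges the battery, v < 0 charges it."""
    energy = HARVEST + v       # energy available for forwarding in this slot
    if energy < 0:             # cannot store more than what was harvested
        return -math.inf
    return 0.5 * math.log2(1.0 + GAINS[t - 1] * energy)


def backward_induction(
    T: int, L: int, payoff: Callable[[int, int, int], float]
) -> Tuple[List[List[float]], List[List[int]]]:
    """Return U[t][s] (optimal payoff-to-go) and policy[t][s] for t = 1..T."""
    U = [[0.0] * (L + 1) for _ in range(T + 1)]        # row 0 is unused
    policy = [[0] * (L + 1) for _ in range(T + 1)]
    for s in range(L + 1):                             # last slot: use everything
        U[T][s] = payoff(T, s, s)
        policy[T][s] = s
    for t in range(T - 1, 0, -1):
        for s in range(L + 1):
            best_v, best_u = 0, -math.inf
            for v in range(s - L, s + 1):              # keeps s - v within 0..L
                u = payoff(t, s, v) + U[t + 1][s - v]
                if u > best_u:
                    best_v, best_u = v, u
            U[t][s], policy[t][s] = best_u, best_v
    return U, policy


U_table, policy_table = backward_induction(T=4, L=3, payoff=toy_payoff)
```

Reading `policy_table[t][s]` forward from an empty battery at $t = 1$ yields the decision sequence, in the same way that the recorded sets of decisions are unrolled in Algorithm 3.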

Algorithm 3 Joint Power Splitting and Battery Operation Optimization with Non-causal CSI
1: Input: Non-causal channel state information (i.e., $h_k(t), g_k(t), \forall k, \forall t$).
2: Initialization:
3: Set the time index as $t = T$,
4: for all allowable battery state $\mathbf{S}(T)$
5: Deduce $\mathbf{I}^*(\mathbf{S}(T))$ for the state evolution $\mathbf{v}(T) = \mathbf{S}(T)$ according to Algorithm 2,
6: Deduce the optimal summation of payoffs $U^*(\mathbf{S}(T))$ according to (38).
7: Record the optimized sets of decisions as $\mathcal{L}(T)|_{\mathbf{S}(T)} = \{(\mathbf{S}(T), \mathbf{I}^*(\mathbf{S}(T)))\}$.
8: Repeat:
9: $t = t - 1$;
10: for all allowable battery state $\mathbf{S}(t)$
11: for all allowable state evolution $\mathbf{v}(t)$
12: Deduce $\mathbf{I}^*(\mathbf{v}(t))$ for the $\mathbf{v}(t)$ according to Algorithm 2,
13: Find $U^*(\mathbf{S}(t+1))$ with $\mathbf{S}(t+1) = \mathbf{S}(t) - \mathbf{v}(t)$ in the last repetition.
14: Deduce the optimal summation of payoffs $U^*(\mathbf{S}(t))$ according to (26).
15: Mark the optimal decision as $(\mathbf{v}^*(t), \mathbf{I}^*(\mathbf{v}^*(t))) = \arg U^*(\mathbf{S}(t))$ according to (26).
16: Record the optimized sets of decisions as $\mathcal{L}(t)|_{\mathbf{S}(t)} = \{(\mathbf{v}^*(t), \mathbf{I}^*(\mathbf{v}^*(t))), \mathcal{L}(t+1)\}$.
17: Until: $t = 1$.
18: Return: The optimal set of decisions $\mathcal{L}(1)|_{\mathbf{S}(1)}$.

IV. Optimized Joint Power Splitting and Battery Operation Design with Causal Channel State Information

In the previous section, the optimal joint PS and battery operation design is derived with the knowledge of non-causal CSI, which reveals a theoretical bound for the proposed harvest-use-store PS relaying strategy. Considering that the non-causal CSI is hard to obtain in practice, two optimized joint PS and battery operation designs are presented in this section for the case in which the CSI is known only causally. First, a finite-state Markov chain model approach is proposed, which extends the proposed layered optimization method by making use of the statistical properties of the CSI. Then, a greedy method approach is designed, which requires no information about the CSI during the subsequent time slots.

A.  A  Finite-state  Markov  Chain  Model  Approach 

In the proposed layered optimization method, both the information transfer design and the energy level variation design need to be optimized. Note that, when the CSI is known only causally, the information transfer design needs to be optimized at the beginning of each time slot, while the energy level variation design needs to be optimized before transmission because it is coupled over time. However, the optimized information transfer design is required in the embedded problem before transmission. To ensure this, the optimization of the joint PS and battery level variation design with causal CSI is carried out in two steps. Before transmission, similar to the non-causal case, both the embedded information transfer design and the overall energy level variation design are optimized with the modeled link gains. Then, at the beginning of each transmission, the information transfer design is updated and optimized with accurate link gains obtained from the causal CSI.

Before transmission, an $m$-state first-order Markov chain model [26] is adopted to represent link gains using the knowledge of the distributions of channels, which has been proved to be an accurate model for slow fading channels in [27]. As a result, the continuous variable representing the amplitude of each link gain is quantized by mapping it to $m$ non-overlapping intervals, each of which is called a state. Different methods to obtain the boundary values of these intervals have been summarized in [27], and the equal probable steady state method is adopted in this paper, where the steady state probabilities of the states are equal. Denoting the sets of quantized link gains as $\mathcal{C}_{S,k} = \{h_k^1, h_k^2, \cdots, h_k^m\}$ for $h_k(t), \forall t$ and $\mathcal{C}_{k,D} = \{g_k^1, g_k^2, \cdots, g_k^m\}$ for $g_k(t), \forall t$, the link gains in Problem (P1) are replaced by

$$|h_k(t)| = |\varsigma_{S,k}(t)|, \ \forall k, \forall t, \qquad |g_k(t)| = |\varsigma_{k,D}(t)|, \ \forall k, \forall t, \tag{39}$$

where $|\varsigma_{S,k}(t)| \in \mathcal{C}_{S,k}$ and $|\varsigma_{k,D}(t)| \in \mathcal{C}_{k,D}$. Further, to give a distinct demonstration of the finite-state Markov chain model approach, a block fading channel model is considered in this paper, i.e., values of link gains at different time slots are independent. Note that correlated channel models can also be analyzed by using the finite-state Markov chain models presented in [27]. As a result, the transition probability of states between two adjacent time slots can be given by

$$p(\varsigma_{S,k}(t) \mid \varsigma_{S,k}(t-1)) = p(\varsigma_{S,k}(t)), \ \forall k, \forall (t \neq 1), \qquad p(\varsigma_{k,D}(t) \mid \varsigma_{k,D}(t-1)) = p(\varsigma_{k,D}(t)), \ \forall k, \forall (t \neq 1). \tag{40}$$

According to the equal probable steady state method, the probability of each state at time slot $t$ is given by

$$p(\varsigma_{S,k}(t)) = p(\varsigma_{k,D}(t)) = 1/m, \ \forall k, \forall t. \tag{41}$$
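As a small worked example of the equal probable steady state partition, the sketch below computes interval boundaries for a Rayleigh-distributed amplitude whose squared value is exponential with unit mean, so that each of the $m$ intervals has probability $1/m$. It is only an illustration: the function names are hypothetical, and the choice of a representative gain for each state is not specified here.

```python
import bisect
import math
from typing import List


def equiprobable_thresholds(m: int) -> List[float]:
    """Boundaries r_0 = 0 < r_1 < ... < r_m = inf with P(r_i <= r < r_{i+1}) = 1/m.

    For an amplitude r with r**2 ~ Exp(1), the CDF is 1 - exp(-r**2),
    so the i-th boundary solves 1 - exp(-r**2) = i/m.
    """
    inner = [math.sqrt(-math.log(1.0 - i / m)) for i in range(1, m)]
    return [0.0] + inner + [math.inf]


def quantize(amplitude: float, thresholds: List[float]) -> int:
    """Map a measured amplitude to its state index in 0..m-1."""
    return bisect.bisect_right(thresholds, amplitude) - 1


thresholds_m3 = equiprobable_thresholds(3)
state = quantize(0.8, thresholds_m3)   # falls into the middle interval
```

With $m = 3$ the inner boundaries are roughly 0.637 and 1.048, and mapping an accurate link gain to its state index is a single sorted lookup.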

To optimize the energy level variation design (i.e., $\mathbf{v}(t), \forall t$) before transmission, we follow a similar approach as the proposed layered optimization method in the previous section. First, the following components are defined.

6) channel state: The channel state is defined as the quantized link gains of all channels, which is described by $\mathbf{H}(t) = \{|\varsigma_{S,1}(t)|, \cdots, |\varsigma_{S,K}(t)|, |\varsigma_{1,D}(t)|, \cdots, |\varsigma_{K,D}(t)|\}$ at time slot $t$.

With a channel state $\mathbf{H}(t)$, the payoff defined in (21) can be rewritten as $R'(\mathbf{S}(t), \mathbf{H}(t), \mathbf{D}(t), t)$, which is obtained via replacing $|h_k(t)|, |g_k(t)|$ by $|\varsigma_{S,k}(t)|, |\varsigma_{k,D}(t)|$ in (21). As a result, the expected summation of payoffs can be utilized to help the system make decisions.

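7) expected summation of payoffs: Similar to the definition in (22), the expected summation of payoffs is defined by

$$U(\mathbf{S}(t), \mathbf{H}(t), \mathbf{D}(t), t) = R'(\mathbf{S}(t), \mathbf{H}(t), \mathbf{D}(t), t) + \sum_{\mathbf{H}(t+1)}\left\{p(\mathbf{H}(t+1))\, U(\mathbf{S}(t+1), \mathbf{H}(t+1), \mathbf{D}(t+1), (t+1))\right\}, \ \forall (t \neq T),$$

$$U(\mathbf{S}(T), \mathbf{H}(T), \mathbf{D}(T), T) = R'(\mathbf{S}(T), \mathbf{H}(T), \mathbf{D}(T), T). \tag{42}$$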

The goal is to maximize the expected summation of payoffs. First, the embedded optimization problem to derive $R'(\mathbf{S}(t), \mathbf{H}(t), \mathbf{D}(t), t)$ can be solved by using Algorithm 2; then the programming problem to derive the optimized energy level variation design can be solved by using backward induction. In particular, the Bellman equation is given by

$$U^*(\mathbf{S}(t), \mathbf{H}(t), t) = \max_{\mathbf{v}(t)}\left\{R'(\mathbf{S}(t), \mathbf{H}(t), \mathbf{D}(t), t) + \sum_{\mathbf{H}(t+1)}\frac{1}{m^{2K}}\, U^*(\mathbf{S}(t+1), \mathbf{H}(t+1), (t+1))\right\}, \ \forall (t \neq T), \tag{43}$$

$$U^*(\mathbf{S}(T), \mathbf{H}(T), T) = R'(\mathbf{S}(T), \mathbf{H}(T), \mathbf{S}(T), \mathbf{I}^*(T), T), \tag{44}$$

where the factor $\frac{1}{m^{2K}}$ in (43) is derived from (41), since each of the $2K$ links takes each of its $m$ states with probability $1/m$. $R'(\mathbf{S}(T), \mathbf{H}(T), \mathbf{S}(T), \mathbf{I}^*(T), T)$ in (44) is interpreted as $\mathbf{D}(T) = (\mathbf{v}(T), \mathbf{I}^*(T)) = (\mathbf{S}(T), \mathbf{I}^*(T))$, which implies that the stored power at all relays should be used up at the last time slot. Then, the procedure to optimize the energy level variation design before transmission is summarized in Algorithm 4.


Algorithm 4 Expected Summation of Payoffs Optimization
1: Input: The set of states $\mathcal{C}_{S,k}, \mathcal{C}_{k,D}, \forall k$ according to the equal probable steady state method [27].
2: Initialization:
3: Set the time index as $t = T$,
4: for all possible channel state $\mathbf{H}(T)$
5: Steps 4-7 as in Algorithm 3 using (44).
6: Repeat:
7: $t = t - 1$;
8: for all possible channel state $\mathbf{H}(t)$
9: Steps 11-16 as in Algorithm 3 using (43).
10: Until: $t = 1$.
11: Return: The look up table $\{\mathbf{v}^*(\mathbf{S}(1), \mathbf{H}(1)), \cdots, \mathbf{v}^*(\mathbf{S}(T), \mathbf{H}(T))\}$.
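The construction of the look up table can be sketched in Python as follows. This is a minimal single-relay, single-link illustration under stated assumptions: the channel takes one of $m$ equiprobable, independent states, so the expectation weight is $1/m$ rather than the multi-link weight in (43); the payoff is a hypothetical stand-in for $R'$; and `build_lookup_table` is a made-up name.

```python
import math
from typing import List, Tuple


def build_lookup_table(
    T: int, L: int, gains: List[float], harvest: int = 2
) -> Tuple[List[List[List[float]]], List[List[List[int]]]]:
    """Return U[t][s][h] and table[t][s][h] = v* for t = 1..T (row 0 unused)."""
    m = len(gains)

    def payoff(h: int, v: int) -> float:
        # Hypothetical payoff: forwarding energy is the harvest plus the discharge.
        energy = harvest + v
        if energy < 0:
            return -math.inf
        return 0.5 * math.log2(1.0 + gains[h] * energy)

    U = [[[0.0] * m for _ in range(L + 1)] for _ in range(T + 1)]
    table = [[[0] * m for _ in range(L + 1)] for _ in range(T + 1)]
    for s in range(L + 1):
        for h in range(m):
            U[T][s][h] = payoff(h, s)      # last slot: use up the battery
            table[T][s][h] = s
    for t in range(T - 1, 0, -1):
        # Expected payoff-to-go for each next battery state, averaged over
        # the m equiprobable channel states of the next slot.
        expected_next = [sum(U[t + 1][s2]) / m for s2 in range(L + 1)]
        for s in range(L + 1):
            for h in range(m):
                best_v, best_u = 0, -math.inf
                for v in range(s - L, s + 1):
                    u = payoff(h, v) + expected_next[s - v]
                    if u > best_u:
                        best_v, best_u = v, u
                U[t][s][h], table[t][s][h] = best_u, best_v
    return U, table


QUANTIZED_GAINS = [0.1, 1.0, 4.0]          # hypothetical quantized link gains
U_exp, lookup = build_lookup_table(T=5, L=3, gains=QUANTIZED_GAINS)
```

At run time, a design such as Algorithm 5 would only need to index `lookup[t][s][h]` with the current battery state and the quantized channel state.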


By using Algorithm 4, a look up table is established before transmission, which records the optimized energy level variation designs for each possible channel state and battery state (i.e., $(\mathbf{S}(t), \mathbf{H}(t))$). After that, the information transfer design is updated and optimized with the causal CSI. In particular, at each transmission, the system first searches for the optimized energy level variation design in the look up table, where $\mathbf{H}(t)$ is obtained by mapping the accurate link gains to the quantized link gains. Next, the information transfer design at this time slot is optimized with the battery state, the accurate link gains and the located energy level variation design. This optimization problem has the same problem formulation as the embedded optimization problem in (25) except that the optimized result of the energy level variation design is obtained from the look up table, and it can be handled by using Algorithm 2.

In summary, the two-step procedure to optimize the joint PS and battery operation design with a finite-state Markov chain model approach is described in Algorithm 5.


Algorithm 5 Joint Power Splitting and Battery Operation Optimizing with a Finite-state Markov Channel Model Approach
1: Input: Causal CSI (i.e., $h_k(t), g_k(t), \forall k$ at time slot $t$), and the look up table through Algorithm 4.
2: Initialization:
3: Set the time index as $t = 1$, and the battery state as $\mathbf{S}(t) = \{0, \cdots, 0\}$,
4: Set the optimized energy level variation solution at the last time slot as $\mathbf{v}^*(t-1) = \{0, \cdots, 0\}$.
5: Repeat:
6: Update the battery state $\mathbf{S}(t)$ according to (19) with $\mathbf{v}^*(t-1)$,
7: Evaluate the channel state $\mathbf{H}(t)$ by mapping $h_k(t), g_k(t), \forall k$ to $\mathcal{C}_{S,k}, \mathcal{C}_{k,D}, \forall k$,
8: Search for the optimal energy level variation design $\mathbf{v}^*(t)$ from the look up table,
9: Deduce the optimized information transfer design $\mathbf{I}^*(t)$ using Algorithm 2 with $\mathbf{S}(t)$ and $\mathbf{v}^*(t)$,
10: Update $t = t + 1$.
11: Until: $t = T$.
12: Return: The optimized joint PS and battery operation design $(\mathbf{v}^*(t), \mathbf{I}^*(t))$ at time slot $t$.


B.  A  Greedy  Method  Approach 


To utilize the statistical properties of CSI, a look up table is established in the above finite-state Markov chain model approach, which requires high computational complexity when the number of channel states is large. To lower the complexity, a greedy method that requires no information about the subsequent time slots is proposed. The key idea is to maximize the throughput at each transmission and ignore the effects on the subsequent time slots. As a result, Problem (P1) is approximated by time-decoupled throughput optimization problems at each time slot, where the formulated optimization problem at time slot $t$ is given by

$$\begin{aligned}(\mathrm{P5}): \max_{\mathbf{v}(t), \mathbf{I}(t)} \ & R(\mathbf{v}(t), \mathbf{I}(t)) = \frac{1}{2}\log\left(1 + \mathrm{SNR}(t)\right) \\ \text{s.t.} \ & C1 \text{ (the bounds on } \lambda_{k,I}(t)), \ \forall k, \\ & v_k(t) \in \{-L, \cdots, 0, \cdots, L\}, \ \forall k, \\ & B_k(t) - B_{\max} \leq v_k(t) \leq B_k(t), \ \forall k, \\ & B_k(t) = B_k(t-1) - v_k^*(t-1), \ \forall k,\end{aligned} \tag{45}$$

where $v_k^*(t-1)$ is the optimized energy level variation design at the last time slot, and the fourth constraint implies that the energy level variation design results in the evolution of battery states. Since the goal is to maximize $R(\mathbf{v}(t), \mathbf{I}(t))$, while the effects of the design on the subsequent time slots are not evaluated, the following insight can be given to optimize the energy level variation design.

Theorem 1: If all batteries are empty initially, the optimized joint PS and battery operation design using the greedy method is to use up the harvested energy at each transmission.

Proof: First, similar to the formulation of (25), since $R(\mathbf{v}(t), \mathbf{I}(t))$ is monotonically increasing in terms of $\mathrm{SNR}(t)$, Problem (P5) is equivalent to the SNR maximization problem [19].

The theorem is proved step by step. At the first time slot, all batteries are empty (i.e., $B_k(1) = 0, \forall k$). Since the power provided by battery discharging cannot exceed the power stored in the battery (i.e., $b_{k,F}(t) \leq B_k(t), \forall k, \forall t$), the power discharged at time slot 1 is

$$b_{k,F}(1) = 0, \ \forall k. \tag{46}$$

To maximize the optimization objective $\mathrm{SNR}(1)$, the PS ratio for battery charging should satisfy

$$\lambda_{k,B}(1) = 0, \ \forall k, \tag{47}$$

because otherwise $\mathrm{SNR}(1)$ can be further improved by reassigning the PS ratios such that $\lambda'_{k,B}(1) = 0$, $\lambda_{k,F}(1)$ remains the same, and $\lambda'_{k,I}(1) = \lambda_{k,I}(1) + \lambda_{k,B}(1)$, which follows from the system model equations given earlier. Thus, no signal power is split for battery charging and the harvested energy is used up at this transmission. As a result, each battery stays empty at the next time slot:

$$B_k(2) = B_k(1) + \eta_1\eta_2\lambda_{k,B}(1)|y_k(1)|^2 - b_{k,F}(1) = 0, \ \forall k, \tag{48}$$

which is derived from (10).

Assuming that the harvested energy is used up at time slot $(t-1)$ and each battery stays empty at time slot $t$ (i.e., $B_k(t) = 0, \forall k$), then similar to the above analysis, it can be derived that

$$b_{k,F}(t) = 0, \ \forall k, \qquad \lambda_{k,B}(t) = 0, \ \forall k. \tag{49}$$

In conclusion, the optimized joint PS and battery operation design using the greedy method is to use up the harvested energy at each time slot. □

Based on Theorem 1 and the definition of the energy level variation, it can be derived that

$$v_k^*(t) = 0, \ \forall k, \forall t. \tag{50}$$

Substituting (50) in Problem (P5), the optimization problem can be rewritten as

$$(\mathrm{P6}): \max_{\mathbf{I}(t)} \frac{1}{2}\log\left(1 + \mathrm{SNR}_1(t)\right), \quad \text{s.t. } 0 \leq \lambda_{k,I}(t) \leq 1, \ \forall k, \tag{51}$$

where $\mathrm{SNR}_1(t)$ is obtained via replacing $v_k(t)$ by 0 in (17). As a result, Problem (P6) has the same problem formulation as the embedded optimization problem in (25) and can be solved by using Algorithm 2. In summary, the procedure to optimize the joint PS and battery operation design with a greedy method is described in Algorithm 6, which solves Problem (P6) at different time slots step by step.

Algorithm 6 Joint PS and Battery Operation Optimizing with a Greedy Method
1: Input: Causal CSI (i.e., $h_k(t), g_k(t), \forall k$ at time slot $t$).
2: Initialization:
3: Steps 3 and 4 as in Algorithm 5.
4: Repeat:
5: Set the optimized energy level variation design as $\mathbf{v}^*(t) = \{0, \cdots, 0\}$,
6: Steps 6, 9 and 10 as in Algorithm 5 to solve Problem (P6).
7: Until: $t = T$.
8: Return: The optimized joint PS and battery operation design $(\mathbf{v}^*(t), \mathbf{I}^*(t))$ at time slot $t$.


Note that Theorem 1 indicates that the battery is not utilized in the proposed greedy method, such that a lower bound on the performance of the proposed harvest-use-store PS relaying strategy is provided. Moreover, the solution in Theorem 1 for wireless-powered cooperative relaying is different from the solution for conventional cooperative relaying [19], where not all relays should use up their available power. In particular, a tradeoff between signal amplification and noise amplification needs to be reached in amplify-and-forward relaying, because the transmit power amplifies not only the useful signal but also the noise. This tradeoff is realized by controlling the transmit power at the relays in conventional cooperative relaying, thus only the relays with good channel conditions should use up the transmit power. However, this tradeoff is reached by optimizing the PS ratios in the wireless-powered case, because the harvested energy obtained through PS is the only power source for relays. As a result, the reason to store a part of the harvested energy is to improve the performance at subsequent time slots and reach a better overall throughput.

C.  Computational  Complexity 

The computational complexity of the two proposed online algorithms and of exhaustive searching is compared in this subsection. Note that all three algorithms utilize Algorithm 2 to solve the embedded problem; thus, their computational complexity can be compared by counting the number of times Algorithm 2 is used. Assume that the computational complexity of Algorithm 2 is $C_2$, the total number of time slots is $T$, the number of channel states is $n_c$, and the number of battery states is $N$. According to the constraints C2 and C3, the number of battery operations is also $N$. As a result, the computational complexities of the three algorithms can be seen in Table I. The complexity of exhaustive searching is of order $O((N^2)^T)$, because the evaluation of battery operations through $T$ time slots is coupled over time. The complexity of Algorithm 5 is of order $O(N^2)$ because the dynamic programming adopted in this algorithm is a time-decoupled procedure. Besides, the factor $n_c$ is due to channel state information prediction through the Markov chain model, which makes dynamic programming feasible. Moreover, the complexity of Algorithm 6 is of order $O((N^2)^0)$, because the optimized battery operation in the greedy algorithm is set to 0 for each relay at each time slot according to Theorem 1. In conclusion, Algorithm 5 has a lower computational complexity than exhaustive searching, and Algorithm 6 has a lower computational complexity than Algorithm 5.

TABLE I
Computational Complexity Comparison

Algorithm               Computational Complexity
Exhaustive searching    $O(C_2 (N^2)^T)$
Algorithm 5             $O(n_c C_2 T (N^2))$
Algorithm 6             $O(C_2 T)$
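To give a feel for the orders in Table I, the short Python sketch below counts how many times Algorithm 2 would be invoked under each scheme. The function name and the sample values of $T$, $N$ and $n_c$ are hypothetical and chosen only for illustration.

```python
from typing import Dict


def algorithm2_calls(T: int, N: int, n_c: int) -> Dict[str, int]:
    """Number of embedded-problem solves implied by the orders in Table I."""
    return {
        "exhaustive": (N ** 2) ** T,      # all state/operation pairs, coupled over T slots
        "algorithm5": n_c * T * N ** 2,   # dynamic programming with n_c channel states
        "algorithm6": T,                  # greedy: one solve per time slot
    }


calls = algorithm2_calls(T=5, N=5, n_c=9)
```

For these sample values the counts are 9765625, 1125 and 5, respectively, which makes the gap between the coupled search and the two online designs concrete.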

V.  Numerical  Results 

In this section, the proposed harvest-use-store PS relaying strategy is evaluated with numerical results. The two-hop channels are modeled by the short-term fading coefficients $\rho_{Sk}$ and $\rho_{kD}$ combined with the distance-dependent path loss over $d_{Sk}$ and $d_{kD}$, respectively, where $k$ is the index for relays. To simplify simulations, the distance between the source node and each relay is normalized as $d_{Sk} = 1$ m, while the distance between each relay and the destination node is $d_{kD} = 5$ m; a similar simulation scenario can be seen in [9]. Besides, $|\rho_i|$ denotes the short-term channel fading and is assumed to be Rayleigh distributed, so that $|\rho_i|^2$ follows the exponential distribution with unit mean. The noise powers at each relay and the destination are both set to 1 (i.e., $\sigma^2 = \sigma_D^2 = 1$). The average signal-to-noise ratio (SNR) is defined in terms of $P$, the maximum transmit power at the source node, and the noise power. Besides, the energy conversion efficiency is set as $\eta_1 = 0.4$, the storage efficiency is $\eta_2 = 0.8$, and the coefficient for the battery size is $\alpha = 1$. For the purpose of presentation, the three proposed joint PS and battery operation designs are termed the optimal design (Algorithm 3), the general design (Algorithm 5), and the greedy design (Algorithm 6), respectively.

A. Convergence of the Proposed Designs

The convergence behavior of the adopted iterative algorithms is depicted in Fig. 2, where "a - b iterations" in this figure denotes $a$ iterations in the alternating optimization and $b$ iterations in the Dinkelbach algorithm. Note that these two iterations, in Algorithm 1 and Algorithm 2, are adopted in all three proposed designs, thus only the convergence behavior of the optimal design is presented due to limited space. We set the number of relays as $K = 2$, the number of energy levels as $L + 1 = 10$, the total number of time slots as $T = 5$, and the number of Markov states as $m = 3$. The results show that the alternating optimization algorithm converges within 5 iterations, and so does the Dinkelbach algorithm.

Fig. 2. Convergence of the proposed optimal design ($K = 2$, $T = 5$, and $L = 9$).

B.  Performance  Comparison  of  the  Proposed  Designs 

To compare the performance of the three proposed designs, Fig. 3 shows the average throughput of the different joint PS and battery operation designs versus the SNR, where the number of relays is $K = 2$, the number of energy levels is $L + 1 = 5$, the total number of time slots is $T = 10$, and the number of Markov states is $m = 3$. The exhaustive search determines the optimal joint PS and battery operation design with non-causal CSI numerically. First, the performance of the proposed optimal design is indistinguishable from the optimal one derived from exhaustive searching, which provides an upper bound on performance for the proposed strategy. Next, the performance of the proposed greedy design, which does not need a battery, provides a lower bound on performance for the proposed strategy. Moreover, the results reveal that by utilizing the non-causal CSI, or to a lesser extent the statistical information of CSI, the throughput performance of the proposed strategy can be improved.

Fig. 3. Average throughput performance with different joint PS and battery operation designs ($K = 2$, $T = 10$, $m = 3$ and $L = 4$).

C.  Performance  Comparison  with  Conventional  Strategies 

To reveal the advantages of using distributed beamforming
in the proposed strategy, performance comparisons using
different precoding techniques are depicted in Fig. 4 in terms
of system throughput. We set the number of relays to
K = 3, the number of energy levels to L + 1 = 5, and the
total number of time slots to T = 5. It can be seen that the
proposed designs with distributed beamforming outperform
the optimized joint PS and battery operation designs with
both the best relay selection and random relay selection.


Fig. 4. Average throughput performance with different precoding
techniques, with K = 3, T = 5, and L = 4.

To reveal the advantages of using the PS receiving architecture
in the proposed strategy, performance comparisons using different
energy harvesting receiving architectures are illustrated
in Fig. 5 and Fig. 6. Specifically, the average throughput
performance versus SNR is depicted in Fig. 5, where
the number of relays is K = 2, the number of energy levels
is L + 1 = 5, and the number of time slots is T = 10.
It is revealed that our proposed strategy outperforms time
switching-based relaying in terms of system throughput.


Fig.  5.  Average  throughput  performance  with  different  energy  harvesting 
receiving  architectures  (K  =  2,  T  =  10,  and  L  =  4). 

Moreover, a performance comparison for delay-constrained
services is shown in Fig. 6, where the number
of relays is K = 2, the number of energy levels is L + 1 = 5,
the bandwidth is 1 MHz, and the average SNR is 10 dB.
This figure depicts the minimum number of time slots required
to finish the data transmission of a delay-constrained service.
It is revealed that the proposed strategy with PS consumes fewer
time slots than time switching-based relaying,
which indicates that the proposed strategy is more suitable
for applications with critical delay constraints.
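
The quantity plotted against data size, the minimum number of time slots needed to deliver a given payload, can be sketched as a simple accumulation of per-slot throughput. The helper `min_slots`, the slot duration, and the pre-log factor of 1/2 (a common half-duplex relaying assumption) are hypothetical and are not taken from the paper; only the 1 MHz bandwidth and 10 dB SNR come from the text.

```python
import math


def min_slots(data_bits, snr_db_per_slot, bandwidth_hz, slot_seconds,
              prelog=0.5):
    """Return the first slot index at which `data_bits` have been delivered.

    Each slot contributes prelog * B * tau * log2(1 + SNR) bits (assumed
    Shannon-rate model). Returns None if the SNR sequence is too short.
    """
    delivered = 0.0
    for n, snr_db in enumerate(snr_db_per_slot, start=1):
        snr = 10.0 ** (snr_db / 10.0)  # dB to linear scale
        delivered += prelog * bandwidth_hz * slot_seconds * math.log2(1.0 + snr)
        if delivered >= data_bits:
            return n
    return None


# 1 MHz bandwidth and 10 dB SNR as in the text; the 1 ms slot is an assumption.
print(min_slots(5000, [10.0] * 10, 1e6, 1e-3))
```

A strategy with a higher per-slot throughput reaches the payload threshold earlier, which is what a curve lying lower in such a figure expresses.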


Fig. 6. Minimum required time slots versus data size of a delay-constrained
service (K = 2, L = 4).

To show the advantages of using the harvest-use-store model
in the proposed strategy, performance comparisons using
different power management models are illustrated in Fig. 7
in terms of system throughput, where the number of relays
is K = 2, the number of energy levels is L + 1 = 10, and
the number of time slots is T = 5. Note that the proposed
strategy  outperforms  the  one  with  harvest-store-use  model 
because  the  proposed  strategy  avoids  unnecessary  storage 
loss  at  relay  batteries  ED.  Moreover,  the  proposed  strategy 
outperforms  the  one  with  harvest-use  model  because  the 
harvested  energy  can  be  accumulated  for  future  usage,  which 
realizes  a  more  efficient  utilization  of  harvested  energy  03. 
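
The qualitative difference between the three power management models can be seen in a toy energy-accounting sketch. This is not the paper's battery model: the storage efficiency of 0.8, the fixed per-slot energy demand, and the function `delivered_energy` are assumptions chosen only to show where storage loss and wasted energy arise.

```python
def delivered_energy(harvest, demand, model, efficiency=0.8):
    """Total energy delivered to the load under a toy power management model.

    efficiency: fraction of energy that survives being stored (assumed).
    """
    battery, delivered = 0.0, 0.0
    for h, need in zip(harvest, demand):
        if model == "harvest-use":
            # No battery: energy beyond the current need is wasted.
            delivered += min(h, need)
        elif model == "harvest-store-use":
            # Everything passes through the battery and pays the storage loss.
            battery += efficiency * h
            used = min(battery, need)
            battery -= used
            delivered += used
        elif model == "harvest-use-store":
            # Harvested energy is used directly first; only the rest is stored.
            direct = min(h, need)
            from_battery = min(battery, need - direct)
            battery += efficiency * (h - direct) - from_battery
            delivered += direct + from_battery
        else:
            raise ValueError(f"unknown model: {model}")
    return delivered


harvest = [3.0, 0.0, 3.0, 0.0]   # hypothetical harvested energy per slot
demand = [2.0, 2.0, 2.0, 2.0]    # hypothetical energy demand per slot
for name in ("harvest-use", "harvest-store-use", "harvest-use-store"):
    print(name, delivered_energy(harvest, demand, name))
```

In this example the harvest-use-store accounting delivers the most energy, because only the surplus incurs the storage loss and none of it is discarded.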


Fig.  7.  Average  throughput  performance  with  different  power  management 
models  (K  =  2,  T  =  5,  and  L  =  9). 


VI.  Conclusion 

In this paper, to support an efficient utilization of harvested
energy and improve throughput in wireless-powered multi-relay
cooperative networks, a harvest-use-store power splitting
(PS) relaying strategy with distributed beamforming has
been investigated. Since the formulated throughput maximization
problem is intractable, a layered optimization method has
been proposed to decompose the joint PS and battery operation
design into two layers, which transforms the intractable
optimization problem into a dynamic programming problem
with an embedded optimization subproblem.
The layered optimization method has been implemented in
the non-causal channel state information (CSI) case, which
leads to a theoretical bound for the proposed strategy, and
extended to the general causal CSI case. To achieve a tradeoff
between performance and complexity, a greedy method
has been proposed to optimize the joint PS and battery
operation design with causal CSI. Simulation results have
shown that the proposed harvest-use-store PS-based relaying
strategy outperforms the time switching-based relaying strategy
and the conventional PS-based relaying strategy without energy
accumulation. This work will be extended to the full-duplex relay
mode in our future work.

Appendix A
PROOF OF LEMMA 2

From (32), the second-order derivative of the objective
function in Problem (P4) with respect to $x_j$ satisfies

$$\frac{d^2 \left(F_1(x_j) - qF_2(x_j)\right)}{dx_j^2} \leq 0. \tag{52}$$

Since this second-order derivative is non-positive, the optimization
objective in Problem (P4) is a concave function in terms of
$x_j$. This completes the proof of Lemma 2.
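
Concavity of a parametric objective of the form $F_1(x_j) - qF_2(x_j)$ is what makes a Dinkelbach-type iteration for maximizing the ratio $F_1/F_2$ practical, since each inner problem is then a one-dimensional concave maximization. The sketch below assumes such an iteration; whether Algorithm 1 has exactly this form is not confirmed by this section. The functions `toy_f1` and `toy_f2` are hypothetical stand-ins, not the paper's $F_1$ and $F_2$.

```python
import math


def ternary_max(f, lo, hi, iters=200):
    """Maximize a concave one-dimensional function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2.0


def dinkelbach(f1, f2, lo, hi, tol=1e-9, max_iter=100):
    """Maximize f1(x) / f2(x) on [lo, hi], assuming f2 > 0 and that
    f1 - q * f2 is concave in x for every q >= 0."""
    x = lo
    q = f1(x) / f2(x)
    for _ in range(max_iter):
        x = ternary_max(lambda v: f1(v) - q * f2(v), lo, hi)
        gap = f1(x) - q * f2(x)  # reaches zero at the optimal ratio
        q = f1(x) / f2(x)
        if gap < tol:
            break
    return x, q


A = 4.0  # hypothetical energy budget


def toy_f1(x):
    # Concave in x: 1 + 2*sqrt(A - x) + (A - x)
    return (1.0 + math.sqrt(max(A - x, 0.0))) ** 2


def toy_f2(x):
    # Positive and affine in x
    return 2.0 * (A - x) + 1.0


x_star, q_star = dinkelbach(toy_f1, toy_f2, 0.0, A)
print(x_star, q_star)
```

For this toy pair the ratio can be maximized by hand: with $s = \sqrt{A - x}$ the ratio is $(1+s)^2/(1+2s^2)$, which peaks at $s = 1/2$, so the iteration should settle near $x = 3.75$ with ratio $1.5$.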


Appendix  B 
PROOF  OF  LEMMA  4 

At each iteration of Algorithm 2, one of the $x_j$, $\forall j$, is
updated, and its new value is obtained by using Algorithm 1.
Note that the optimal solution to Problem (P3) is obtained
from Algorithm 1. Thus, when updating $x_j$, the following
relationship holds:

$$\max_{x_j} \frac{F_1(x_j)}{F_2(x_j)} = \frac{F_1(x_j')}{F_2(x_j')} = J(x_1, \cdots, x_j', \cdots, x_K) \geq J(x_1, \cdots, x_j, \cdots, x_K), \quad \forall j, \tag{53}$$

which implies that the optimization objective $J$ in Problem
(P2) does not decrease after each iteration. Moreover, there exists an
upper bound for $J$, given by


feE  m\gk\2  (ak  -  rf)  (i  ^  ft)) 


J upper 


K 


(54) 


E  Vl\9k\  {ak  -  erg)  ^  +  0-2 


fc= 1 


where  bk  =  P\hk  (i)|2  +  of.  This  upper  bound  is  formulated 
by substituting $\lambda_{k,I} = 1$ and $\lambda_{k,F} + \lambda_{k,B} = 1$ into CD
and simplifying (II It in a manner similar to the formulation
of (29), which is the theoretical ideal solution but may not
be practical, as explained in [26]. The practical restriction
$\lambda_{k,I} + \lambda_{k,F} + \lambda_{k,B} = 1$ is adopted in our paper, and
thus the optimized performance cannot exceed the upper
bound. In conclusion, after each iteration of Algorithm 2,
the optimization objective in Problem (P2) does not decrease,
and it is bounded above. Thus, Algorithm 2
converges.
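
The convergence argument, an objective that never decreases under one-coordinate-at-a-time updates and is bounded above, can be checked numerically on a toy problem. The sketch below is an illustration under stated assumptions, not the paper's Algorithm 2: the objective `objective`, the parameters `a`, `c`, `d`, and the grid-based inner maximization are all hypothetical, and the grid search stands in for whatever inner solver is actually used.

```python
import math


def objective(x, a, c, d):
    """Toy ratio objective: (sum_k sqrt(c_k*y_k))^2 / (sum_k d_k*y_k + 1),
    with y_k = a_k - x_k. Assumed form, for illustration only."""
    y = [max(ak - xk, 0.0) for xk, ak in zip(x, a)]
    num = sum(math.sqrt(ck * yk) for ck, yk in zip(c, y)) ** 2
    den = sum(dk * yk for dk, yk in zip(d, y)) + 1.0
    return num / den


def coordinate_ascent(a, c, d, grid_points=50, sweeps=20):
    """Update one coordinate at a time by exact maximization over a grid."""
    n = len(a)
    x = [0.0] * n
    history = [objective(x, a, c, d)]
    for _ in range(sweeps):
        for j in range(n):
            # The current x[j] always lies on this grid, so the best
            # candidate can never be worse than the current point.
            candidates = [a[j] * i / grid_points for i in range(grid_points + 1)]
            x[j] = max(
                candidates,
                key=lambda v: objective(x[:j] + [v] + x[j + 1:], a, c, d),
            )
            history.append(objective(x, a, c, d))
        if history[-1] - history[-1 - n] < 1e-12:
            break  # a full sweep brought no improvement
    return x, history


a = [1.0, 2.0, 1.5]   # hypothetical per-relay budgets
c = [1.0, 0.5, 2.0]   # hypothetical signal weights
d = [0.8, 1.2, 0.3]   # hypothetical noise weights
x_final, history = coordinate_ascent(a, c, d)
print(x_final, history[-1])
```

For this toy objective the Cauchy-Schwarz inequality gives the upper bound $\sum_k c_k/d_k$, so the recorded values form a non-decreasing, bounded sequence and therefore converge, mirroring the structure of the proof above.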